AITopics | listwise deletion

listwise deletion

Generative AI in Sociological Research: State of the Discipline

Alvero, AJ, Stoltz, Dustin S., Stuhler, Oscar, Taylor, Marshall

arXiv.org Artificial Intelligence, Dec-3-2025

Generative artificial intelligence (GenAI) has garnered considerable attention for its potential utility in research and scholarship. A growing body of work in sociology and related fields demonstrates both the potential advantages and risks of GenAI, but these studies are largely proof-of-concept or specific audits of models and products. We know comparatively little about how sociologists actually use GenAI in their research practices and how they view its present and future role in the discipline. In this paper, we describe the current landscape of GenAI use in sociological research based on a survey of authors in 50 sociology journals. Our sample includes both computational sociologists and non-computational sociologists and their collaborators. We find that sociologists primarily use GenAI to assist with writing tasks: revising, summarizing, editing, and translating their own work. Respondents report that GenAI saves time and that they are curious about its capabilities, but they do not currently feel strong institutional or field-level pressure to adopt it. Overall, respondents are wary of GenAI's social and environmental impacts and express low levels of trust in its outputs, but many believe that GenAI tools will improve over the next several years. We do not find large differences between computational and non-computational scholars in terms of GenAI use, attitudes, and concern; nor do we find strong patterns by familiarity or frequency of use. We discuss what these findings suggest about the future of GenAI in sociology and highlight challenges for developing shared norms around its use in research practice.

large language model, machine learning, natural language, (18 more...)

2511.16884

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry:

Education (0.88)
Government (0.68)
Law > Environmental Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Which Imputation Fits Which Feature Selection Method? A Survey-Based Simulation Study

Schwerter, Jakob, Romero, Andrés, Dumpert, Florian, Pauly, Markus

arXiv.org Machine Learning, Dec-18-2024

Tree-based learning methods such as Random Forest and XGBoost are still the gold-standard prediction methods for tabular data. Feature importance measures are usually considered for feature selection as well as to assess the effect of features on the outcome variables in the model. This also applies to survey data, which are frequently encountered in the social sciences and official statistics. These types of datasets often present the challenge of missing values. The typical solution is to impute the missing data before applying the learning method. However, given the large number of possible imputation methods available, the question arises as to which should be chosen to achieve the 'best' reflection of feature importance and feature selection in subsequent analyses. In the present paper, we investigate this question in a survey-based simulation study for eight state-of-the-art imputation methods and three learners. The imputation methods comprise listwise deletion, three MICE options, four missRanger options as well as the recently proposed mixGBoost imputation approach. As learners, we consider the two most common tree-based methods, Random Forest and XGBoost, and an interpretable linear model with regularization.
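To make the contrast between listwise deletion and imputation concrete, here is a minimal sketch in plain Python. It is not the study's code: the toy records, the field names, and the helper functions are hypothetical, and simple mean imputation stands in for the MICE and missRanger variants named in the abstract, which are considerably more elaborate.

```python
from statistics import mean

# Hypothetical toy survey records; None marks a missing answer.
rows = [
    {"age": 34, "income": 52000, "hours": 40},
    {"age": None, "income": 48000, "hours": 38},
    {"age": 29, "income": None, "hours": 45},
    {"age": 51, "income": 61000, "hours": None},
    {"age": 42, "income": 58000, "hours": 36},
]

def listwise_delete(records):
    """Keep only records with no missing value in any field."""
    return [r for r in records if all(v is not None for v in r.values())]

def mean_impute(records):
    """Fill each missing value with the mean of that column's observed values.

    Mean imputation is only a stand-in here, not one of the methods
    compared in the abstract above.
    """
    columns = list(records[0].keys())
    col_means = {
        c: mean([r[c] for r in records if r[c] is not None]) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in records
    ]

complete_cases = listwise_delete(rows)  # drops every row with a gap
imputed = mean_impute(rows)             # keeps all rows, fills the gaps

print(len(rows), len(complete_cases), len(imputed))
```

Listwise deletion keeps only the fully observed records, while imputation keeps every record at the price of filling in estimated values. Either choice changes the data a downstream learner sees, which is the effect on feature importance the study examines.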

artificial intelligence, imputation method, machine learning, (14 more...)

2412.1357

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)
Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
(12 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (1.00)
Education > Educational Setting (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Graphical Models for Inference with Missing Data

Neural Information Processing Systems, Mar-13-2024, 14:42:14 GMT

We address the problem of recoverability, i.e., deciding whether there exists a consistent estimator of a given relation Q when data are missing not at random. We employ a formal representation called 'Missingness Graphs' to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we derive conditions that the graph should satisfy to ensure recoverability and devise algorithms to detect the presence of these conditions in the graph.

data problem, factorization, recoverability, (15 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Iowa > Story County > Ames (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Regression with Missing Data, a Comparison Study of Techniques Based on Random Forests

Gómez-Méndez, Irving, Joly, Emilien

arXiv.org Machine Learning, Oct-18-2021

Random forests and recursive trees are widely used in applied statistics and computer science. The popularity of recursive trees relies on several factors: their easy interpretability, the fact that they can be used for both regression and classification tasks, the small number of hyper-parameters to be tuned and, finally, their non-parametric nature, which allows them to be used to infer arbitrarily complex relations between the input and the output space. A random forest combines several randomized trees, improving the prediction accuracy at the cost of a slight loss in interpretability. This technique is easily parallelizable, which has made it one of the most popular tools for handling high-dimensional data sets. It has been successfully applied to various practical problems, including chemoinformatics, ecology, 3D object recognition, bioinformatics and econometrics. Biau and Scornet (2016) present a detailed list of applications as well as a review of random forests. In the present work we have focused on the ability of random forests to deal with missing values.

algorithm, missing-data mechanism, random forest, (14 more...)

2110.09333

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Missing Data Handling

#artificialintelligence, Oct-10-2021, 04:49:37 GMT

Real-world data is messy and usually contains many missing values. Missing data can skew an analysis, and a data scientist does not want to produce biased estimates that lead to invalid results. After all, any analysis is only as good as its data. Missing data occur when no value is recorded for one or more variables of an observation. Missing data can reduce the statistical power of an analysis, which can undermine the validity of the results.
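The loss of statistical power can be illustrated with a small sketch of how quickly listwise deletion shrinks a dataset. It rests on a simplifying assumption that is not taken from the article: each value is missing independently with the same probability. Under that assumption the expected share of fully observed rows is (1 - p)^k for k variables. The function names and parameter values are hypothetical.

```python
import random

def expected_complete_fraction(p_missing, n_variables):
    """Expected share of fully observed rows, assuming each value is
    missing independently with probability p_missing."""
    return (1.0 - p_missing) ** n_variables

def simulated_complete_fraction(p_missing, n_variables, n_rows, seed=0):
    """Simulate rows and count those that would survive listwise deletion."""
    rng = random.Random(seed)
    complete = 0
    for _ in range(n_rows):
        # A row is complete only if none of its values is missing.
        if all(rng.random() >= p_missing for _ in range(n_variables)):
            complete += 1
    return complete / n_rows

expected = expected_complete_fraction(0.10, 10)
simulated = simulated_complete_fraction(0.10, 10, n_rows=20000)
print(f"expected: {expected:.3f}, simulated: {simulated:.3f}")
```

With 10% missingness in each of 10 variables, only about a third of the rows remain after listwise deletion under this independence assumption, even though 90% of the individual values were observed.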

dataset, listwise deletion, missingness, (13 more...)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Graphical Models for Inference with Missing Data

Mohan, Karthika, Pearl, Judea, Tian, Jin

Neural Information Processing Systems, Dec-31-2013

We address the problem of deciding whether there exists a consistent estimator of a given relation Q, when data are missing not at random. We employ a formal representation called 'Missingness Graphs' to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we define the notion of recoverability, which ensures that, for a given missingness graph G and a given query Q, an algorithm exists such that, in the limit of large samples, it produces an estimate of Q as if no data were missing. We further present conditions that the graph should satisfy in order for recoverability to hold and devise algorithms to detect the presence of these conditions.
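As a loose illustration of why the missingness mechanism matters for consistent estimation, the sketch below compares the complete-case (listwise deletion) mean of a variable under two hypothetical mechanisms. It does not implement missingness graphs or the paper's recoverability conditions. The variable, the missingness probabilities, and the helper names are assumptions chosen only for this example.

```python
import random

rng = random.Random(0)

# Hypothetical fully observed variable Y with true mean 0.
y = [rng.gauss(0.0, 1.0) for _ in range(50000)]

def average(values):
    return sum(values) / len(values)

def observe_completely_at_random(values, p_missing=0.5):
    """Missingness is independent of Y: each value is dropped with
    the same probability."""
    return [v for v in values if rng.random() >= p_missing]

def observe_not_at_random(values):
    """Missingness depends on Y itself: positive values are dropped
    with probability 0.8, the rest with probability 0.2."""
    return [v for v in values if rng.random() >= (0.8 if v > 0 else 0.2)]

full_mean = average(y)
mcar_mean = average(observe_completely_at_random(y))
mnar_mean = average(observe_not_at_random(y))

print(f"full: {full_mean:.3f}, MCAR: {mcar_mean:.3f}, MNAR: {mnar_mean:.3f}")
```

In this toy setting the complete-case mean stays close to the full-sample mean when values go missing completely at random, but it is pulled well below it when missingness depends on the value itself, and collecting more data does not remove that bias. Deciding when and how such a quantity can still be estimated consistently is the question the graphical conditions in the abstract address.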

artificial intelligence, machine learning, recoverability, (17 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Genre: Research Report (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
